ABSTRACT

An approach that is currently gaining popularity in
educational measurement is one that treats item response theory (IRT)
as a special case of nonlinear factor analysis (NLFA). A brief
overview is provided of some of the research that has examined the
relationship between IRT and NLFA. Three NLFA models are outlined,
emphasizing their major strengths and weaknesses. These are: (1)
McDonald's polynomial approximation to a normal ogive model; (2) the
factor analytic model for dichotomous variables of Christoffersson
and Muthén (1975, 1984); and (3) the full-information factor analytic
model of Bock and Aitkin (1981) and Bock, Gibbons, and Muraki (1988).
Although the full-information factor analytic model appears to be the
strongest of the approaches described, it appears that a greater
number of empirical studies should be undertaken to compare the
various NLFA models with respect to how accurately they can recover
simulated parameter values. An appendix presents an IRT-NLFA proof.
(Contains 61 references.) (SLD)
Abstract

Item response theory (IRT) models have been used extensively to address
educational measurement and psychometric concerns pertaining to a host of areas
such as differential item functioning, equating and computer-adaptive testing.
The many advantages of IRT models (e.g., item and ability parameter invariance)
have contributed to their use in a wide number of areas by practitioners and
researchers alike.

Another approach which is currently gaining popularity in educational
measurement is the one that treats item response theory as a special case of
nonlinear factor analysis (NLFA). Several authors have shown that these models
are mathematically equivalent (Balassiano & Ackerman, 1995a; 1995b; Goldstein &
Wood, 1989; Knol & Berger, 1991; McDonald, 1967; 1985; 1989; in press). It would
therefore appear reasonable to make use of NLFA models to examine a multitude of
educational measurement problems which had been, until quite recently, looked at
solely from an IRT perspective.

The purpose of this paper is twofold: 

First, to provide a brief overview of some of the research that has
examined the relationship between IRT and NLFA;

Second, to outline three NLFA models, emphasizing their major strengths and
weaknesses. More precisely, McDonald's (1967; 1982b) polynomial approximation to
a normal ogive model, Christoffersson's (1975)/Muthén's (1984) factor analytic
model for dichotomous variables as well as Bock and Aitkin's (1981)/Bock, Gibbons
and Muraki's (1988) full-information factor analytic model, will be summarized.



Nonlinear FA and its relationship to IRT 

Introduction 

Over the past three decades, the educational measurement and
psychometric literatures have been replete with studies focusing on item
response theory (IRT) models. The numerous textbooks that have been written
centering primarily and, in some instances, exclusively on IRT attest to the
importance of these models in the development and analysis of tests and items
(Baker, 1992; Hambleton, 1983; 1989; Hambleton & Swaminathan, 1985; Hulin,
Drasgow, & Parsons, 1983; Warm, 1978). The use of IRT models has been
widespread in both testing organizations and departments of education for a
variety of purposes such as item analysis (Baker, 1985; Mislevy & Bock, 1990;
Wingersky, Patrick, & Lord, 1991; Thissen, 1993), score equating (Cook &
Eignor, 1983; Lord, 1977; 1980; 1982; Petersen, Kolen, & Hoover, 1989; Skaggs
& Lissitz, 1986), differential item functioning (Thissen, Steinberg, & Wainer,
1993) and computer adaptive testing (Hambleton, Zaal, & Pieters, 1993;
Kingsbury & Zara, 1991; Wainer et al., 1990), to name a few. The many
properties of IRT models, among them, that "sample-free" item parameter
estimates and "test-free" ability estimates can be obtained, have generated
considerable interest in their use to solve a host of measurement-related
problems.

Another approach which is currently gaining popularity in educational
measurement is the one that treats item response theory as a special case of
nonlinear factor analysis (NLFA). Several authors have shown that these models
are mathematically equivalent (Balassiano & Ackerman, 1995a; 1995b; Goldstein
& Wood, 1989; Knol & Berger, 1991; McDonald, 1967; 1985; 1989; in press).
Muthén (1978, 1983, 1984) has also demonstrated that commonly used models in
IRT (e.g., the two-parameter normal ogive model) are really specific cases of a
more general factor analytic model for categorical variables with multiple
indicators (i.e., response categories). McDonald (1982b), starting from
Spearman's common factor model, also shows that IRT models are a special case
of NLFA and provides a general framework which includes unidimensional/
multidimensional, linear/nonlinear models as well as dichotomous and
polychotomous models.

Takane and De Leeuw (1987) have also established that IRT models are 
mathematically equivalent to NLFA. These authors have provided a systematic 
series of proofs that show the equivalence of these models with dichotomous as 
well as polychotomous item responses. 

Thus, it appears as though IRT and NLFA models represent two equivalent
formulations of a more general latent trait model. Indeed, the two terms are
often used interchangeably. For example, the model proposed by Bock and Aitkin
(1981) has been synonymously referred to as full-information factor analysis
(Bock, Gibbons, & Muraki, 1988) and multidimensional IRT (McKinley, 1988).
Given the equivalence of IRT and NLFA, it would appear reasonable to make use
of the latter models to examine a multitude of educational measurement
problems which had been, until quite recently, looked at solely from an IRT
perspective. Several nonlinear factor analytic models, with potential
applications to measurement and psychometric issues, have been proposed in the
literature (Bock & Aitkin, 1981; Bock, Gibbons, & Muraki, 1988; Bock &
Lieberman, 1970; Christoffersson, 1975; McDonald, 1967; 1982b; Muthén, 1978;
1984).

The first part of this paper provides a brief overview
of some of the research that has examined the relationship between common IRT
models and NLFA.

Three NLFA models that have been used to address measurement-related
issues will be presented in the second part of this paper. Specifically,
McDonald's (1967; 1982b) polynomial approximation to a normal ogive model,
Christoffersson's (1975)/Muthén's (1978) factor analytic model for
dichotomous variables as well as Bock and Aitkin's (1981)/Bock, Gibbons and
Muraki's (1988) full-information factor analytic model, will be summarized. In
addition, some of the strengths and weaknesses of the models will be
highlighted.

The relationship between common IRT models
and nonlinear factor analysis

A considerable body of research has been dedicated to examining the
relationship between common IRT models, e.g., logistic and normal ogive
functions, and NLFA (Bartholomew, 1983; Goldstein & Wood, 1989; Knol & Berger,
1991; McDonald, 1967; 1989; Takane & De Leeuw, 1987).

Bartholomew (1983) has provided a general latent trait model on which
several IRT as well as factor analytic functions for dichotomous variables are
founded. The author states that common factor analytic models, such as those
proposed by Bock and Aitkin (1981), Christoffersson (1975) and Muthén (1978),
are special cases of this general latent trait model. The model is of the
form,

\[ G(\pi_i(y)) = \alpha_{i0} + \sum_{j=1}^{q} \alpha_{ij} H(y_j), \quad i = 1, 2, \ldots, p. \tag{1} \]

Bartholomew states that the models outlined by Bock and Aitkin (1981),
Christoffersson (1975) and Muthén (1978) use the probit function,
$G(u) = \Phi^{-1}(u)$, for both G and H. Lord and Novick (1968), whose discussion of
IRT models is restricted to the q = 1 (i.e., unidimensional) case, treat the $y_j$ as
parameters and use the logit for G and the probit for H. "Translated" into the
unidimensional IRT vernacular, the terms in equation (1) would correspond to the
following:

$G(\pi_i)$ = the response function outlining the probability of
obtaining a correct response to item i;

$y$ = a vector of ability (in this case, a scalar, given
that q = 1);

$\alpha_{i0}$ = a parameter related to the difficulty of item i;

$\alpha_{ij}$ = a parameter related to the discrimination of item i on
latent trait j;

$H(y_j)$ = the density function for a given latent trait j.
Fraser and McDonald (1988) and McDonald (1981; in press) also examined
the relationship between common item response functions (IRF) and NLFA.
McDonald (1994) states that the unidimensional normal ogive model given by,

\[ P(Y_i = 1 | \theta_j) = c_i + (1 - c_i) N[a_i(\theta_j - b_i)], \tag{2} \]

where,

$\theta_j$ = the latent variable;

$b_i$ = the $\theta$ value at the point of inflexion of the item
response function;

$a_i$ = the slope of the IRF at its point of inflexion;

$c_i$ = the lower asymptote value of the IRF;

$N[\cdot]$ = the normal distribution function;

can be re-expressed using the latent trait parameterization as,

\[ P(Y_i = 1 | \theta_j) = c_i + (1 - c_i) N[f_{i0} + f_{i1}\theta_j], \tag{3} \]

with $f_{i0} = -a_i b_i$ and $f_{i1} = a_i$, the latter corresponding to the factor
loading of factor 1 on item i. Function (3) can be generalized to the
multidimensional case,

\[ P(Y_i = 1 | \theta) = c_i + (1 - c_i) N[f_{i0} + f_i'\theta]. \tag{4} \]
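
The reparameterization linking (2) and (3) can be checked numerically. The following is a minimal sketch rather than part of any of the programs discussed here; the function names and the item parameter values are hypothetical and chosen only for illustration.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal distribution function N[z]."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def irf_lord(theta, a, b, c=0.0):
    """Equation (2): c + (1 - c) * N[a * (theta - b)]."""
    return c + (1.0 - c) * normal_cdf(a * (theta - b))


def irf_latent_trait(theta, f0, f1, c=0.0):
    """Equation (3): c + (1 - c) * N[f0 + f1 * theta]."""
    return c + (1.0 - c) * normal_cdf(f0 + f1 * theta)


def lord_to_latent_trait(a, b):
    """Map (a, b) to (f0, f1) using f0 = -a * b and f1 = a."""
    return -a * b, a


# Hypothetical item parameters, used only to illustrate the identity.
a, b, c = 1.2, 0.5, 0.2
f0, f1 = lord_to_latent_trait(a, b)

thetas = [-3.0 + 0.5 * k for k in range(13)]
max_gap = max(
    abs(irf_lord(t, a, b, c) - irf_latent_trait(t, f0, f1, c)) for t in thetas
)
print(f0, f1, max_gap)
```

Under these assumptions the two parameterizations return the same probabilities up to floating-point rounding, which is all that the equivalence of (2) and (3) asserts.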

Fraser and McDonald (1988) and McDonald (1981; in press) also
demonstrated that the latent trait model shown in (4) could be derived (cf.
Christoffersson, 1975) in the form,

P(i=l|Q) =Ci+(l-Ci)i\^[tio+i2^fi//nJ . (5) 

The parameters in models (4) and (5) are related by
mj =1 / sj.

where (i = 1, ..., m) and P is the m x m matrix containing the correlations
among the dimensions (assuming the latent traits have been standardized). A
detailed discussion of this relationship is found in McDonald (1985).

Knol and Berger (1991) examined the relationship between several NLFA
models and logistic IRT functions. More precisely, the authors focused their
attention on comparing Bock and Aitkin's (1981) full-information factor
analytic model and McDonald's (1967) polynomial approximation to a normal
ogive model, to the two-parameter logistic IRT function.

Bock and Aitkin's model (1981) uses a marginal maximum likelihood
procedure in the estimation of item parameters. In the model, $X = [X_1, \ldots,
X_n]'$ corresponds to a random response pattern vector to n binary variables,
where each $X_i$ (i = 1, ..., n) value is defined as 1, if the item is correctly
answered and 0, if incorrectly answered. Under the assumption of local
independence, the marginal probability of the response vector X = x is given
by,

\[ P(X = x) = \int \prod_{i=1}^{n} [P_i(\theta)]^{x_i} [1 - P_i(\theta)]^{1 - x_i} g(\theta) \, d\theta, \tag{6} \]

where $P_i(\theta)$ corresponds to the item characteristic function of item i, $g(\theta)$ is
the density function of the latent m-component random vector of abilities $\theta$,
and the integration is taken over the entire multidimensional ability space.
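
To make the integral in (6) concrete, the sketch below approximates it for a single latent trait with a simple trapezoid rule. It is an illustration only, not the quadrature used in any published program: the standard normal density for g, the normal ogive form of the item characteristic function, the grid settings and the three item parameter pairs are all assumptions introduced here.

```python
from itertools import product
from math import erf, exp, pi, sqrt


def normal_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def normal_pdf(t):
    return exp(-0.5 * t * t) / sqrt(2.0 * pi)


def item_prob(theta, a, b):
    """Hypothetical two-parameter normal ogive item characteristic function."""
    return normal_cdf(a * (theta - b))


def marginal_pattern_prob(x, items, lo=-8.0, hi=8.0, steps=1600):
    """Trapezoid approximation to equation (6) for one latent trait."""
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        theta = lo + k * width
        likelihood = 1.0
        for x_i, (a, b) in zip(x, items):
            p = item_prob(theta, a, b)
            # Local independence: multiply the item-level probabilities.
            likelihood *= p if x_i == 1 else 1.0 - p
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * likelihood * normal_pdf(theta) * width
    return total


# Hypothetical (a, b) pairs for a three-item test.
items = [(1.0, -0.5), (0.8, 0.0), (1.5, 0.7)]
pattern_probs = {
    x: marginal_pattern_prob(x, items)
    for x in product((0, 1), repeat=len(items))
}
total_prob = sum(pattern_probs.values())
print(total_prob)
```

Because the 2^n response patterns are exhaustive and mutually exclusive, their marginal probabilities should sum to one, which offers a quick check on the numerical integration.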
Knol and Berger (1991) state that if $\theta$ is assumed to be multivariate
normally distributed with a mean equal to 0 and a covariance matrix I, the
multidimensional two-parameter normal ogive model ICF for item i (i = 1, ..., n)
is given by,

\[ P_i(\theta) = F(\alpha_i'\theta - \beta_i), \tag{7} \]

where,

$\alpha_i$ = the m x 1 vector of item discrimination parameters;

$\beta_i$ = the item difficulty parameter;

$F(\cdot)$ = the cumulative standard normal distribution.

Knol and Berger (1991) also examined the relationship between McDonald's
polynomial approximation to a normal ogive model (McDonald, 1967; 1982b) and
the two-parameter logistic IRT function. McDonald, using harmonic analysis,
proposed a NLFA model that is based on the pairwise joint proportions of the
item responses. The ICFs for this model are approximated by a third-degree
Hermite-Tchebycheff polynomial. The pairwise probabilities $\pi_{ij} = P(X_i = 1, X_j = 1)$
are estimated by minimizing the unweighted least-squares function,

\[ f(A, \beta) = \sum \sum [p_{ij} - \pi_{ij}(\alpha_i, \beta_i, \alpha_j, \beta_j)]^2, \tag{8} \]

where,

$A = (\alpha_1, \ldots, \alpha_n)'$, described in (7);

$\beta = (\beta_1, \ldots, \beta_n)'$, also defined in (7);

$p_{ij}$ = the observed joint proportions.

As Knol and Berger (1991) state, the relationship between the logistic
distribution function L(.) and the cumulative standard normal distribution
function F(.) given by (Mood, Graybill, & Boes, 1974),

\[ |F(z) - L(1.7z)| < .01 \tag{9} \]

for all z (Haley, 1952), makes it possible to approximate the normal ogive ICF
by the logistic ICF,

\[ P_i(\theta) = \frac{1}{1 + \exp[-1.7(\alpha_i'\theta - \beta_i)]} = L[1.7(\alpha_i'\theta - \beta_i)]. \tag{10} \]
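
The bound in (9) is easy to examine numerically. The sketch below, which is only an illustration and uses a grid of z values chosen here for convenience, evaluates the absolute difference between the normal ogive and the rescaled logistic function over the range -5 to 5.

```python
from math import erf, exp, sqrt


def normal_cdf(z):
    """Cumulative standard normal distribution F(z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def logistic_cdf(z):
    """Logistic distribution function L(z)."""
    return 1.0 / (1.0 + exp(-z))


# Evaluate |F(z) - L(1.7 z)| on an evenly spaced grid (assumed range and step).
grid = [-5.0 + 0.01 * k for k in range(1001)]
gaps = [abs(normal_cdf(z) - logistic_cdf(1.7 * z)) for z in grid]
max_gap = max(gaps)
z_at_max = grid[gaps.index(max_gap)]
print(max_gap, z_at_max)
```

On this grid the largest discrepancy stays below .01, in line with the inequality in (9), so replacing F by L with the 1.7 scaling changes any response probability by less than one percentage point.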

Takane and De Leeuw (1987) also showed that common IRT models and NLFA
models are formally equivalent. These authors have provided a proof that
demonstrates the equivalence of Bock and Aitkin's (1981) full-information
factor analytic model to Christoffersson's (1975)/Muthén's (1984) generalized
least-squares factor analytic model for dichotomous variables. This proof is
presented in Appendix A for the reader's benefit.

Summary 

The purpose of the first part of the paper was to briefly outline past
research that has investigated the relationship between common IRT models and
NLFA. These studies have shown that logistic and normal ogive functions are
formally equivalent to McDonald's (1967; 1982b) polynomial approximation to a
normal ogive model, Christoffersson's (1975)/Muthén's (1984) factor analytic
model for dichotomous variables, and the full-information factor analytic
approach advocated by Bock and Aitkin (1981) as well as Bock, Gibbons and
Muraki (1988). Hence, based on the IRT-NLFA relationship, it would appear that
these latter models might provide a useful framework with which common
measurement and psychometric problems can be addressed. A summary of these
three NLFA models is provided in the next section of the paper, emphasizing
some of the advantages and limitations of each approach for the practitioner.

A polynomial approximation to a normal ogive model

McDonald (1967; 1982a; 1982b; 1989; in press) and McDonald and Ahlawat
(1974) have provided a general framework that enables the organization of
existing unidimensional as well as multidimensional IRT models based on a more
general NLFA approach. Specifically, generalizing from Spearman's common
factor model, McDonald (1982b) has presented three classes of models which can
be used in educational measurement, that is,

i. models that are strictly linear in both their coefficients and
latent traits;

ii. models that are linear in their coefficients but not in their
latent traits;

iii. models that are strictly nonlinear.

McDonald's distinctive contribution to the area, however, lies with the
second class of models presented. McDonald and Ahlawat (1974) have proposed a
group of regression functions that are linear in their coefficients (i.e.,
their item parameters) but nonlinear in their latent traits, of the general
form,

\[ f_i(\theta_1, \ldots, \theta_t) = a_{i0} + \sum_{l=1}^{t} \sum_{p=1}^{s} a_{ilp} h_p(\theta_l) \quad (i = 1, \ldots, n), \tag{11} \]

where,

$f_i(\theta_1, \ldots, \theta_t)$ = a function that represents the probability
that an examinee with latent trait values
$\theta_1, \ldots, \theta_t$ will correctly respond to the
ith binary item;

$a_{i0}$ = an intercept parameter of the regression
function for item i;

$a_{ilp}$ = a regression coefficient for item i on
latent trait l of the p-th polynomial
degree;

$h_p(\theta_l)$ = a general polynomial function of the form,

^iA^^i.e^...-f,,e^ (12) 

An IRT model which describes the probability that a randomly selected
examinee j of ability $\theta_j$ will correctly answer an item is the two-parameter
normal ogive model. The item characteristic curve (ICC) for the model is given
by,

\[ P_i(\theta_j) = \int_{-\infty}^{z_{ij}} \frac{1}{\sqrt{2\pi}} e^{-t^2/2} \, dt, \tag{13} \]

where t is the normal deviate. One common parameterization of $z_{ij}$ for item i
is,

\[ z_{ij} = a_i(\theta_j - b_i), \tag{14} \]

where $a_i$ and $b_i$ have been previously defined in (2). McDonald (1967), using
harmonic analysis, has shown that the normal ogive model could also be
approximated as closely as desired by a polynomial series of the general form,

where $f_{ik}$ is the factor loading of factor k on item i.

The unweighted least squares (ULS) function that is minimized to enable
the estimation of the pairwise probabilities $\pi_{ij} = P(X_i = 1, X_j = 1)$ is,

\[ f(A, \beta) = \sum \sum [p_{ij} - \pi_{ij}(\alpha_i, \beta_i, \alpha_j, \beta_j)]^2, \tag{16} \]

with A, $\beta$ and $p_{ij}$ previously defined in (7) and (8). As was stated earlier,
the ICFs for this model are approximated by a third-degree Hermite-Tchebycheff
polynomial. A few advantages and limitations of the model are presented in the
next section of the paper.

Advantages and limitations of McDonald's polynomial approximation to a normal
ogive model

As was previously stated, McDonald's (1967) approach to NLFA uses ULS
estimation of the model parameters. ULS estimation is quite economical as
compared to generalized least-squares and maximum likelihood procedures and
hence has the practical advantage of allowing for the analysis of tests with a
fairly large number of items and/or dimensions.

Also, McDonald's model has been implemented in the computer program
NOHARM (Fraser & McDonald, 1988). The program enables the user to fit
confirmatory or exploratory unidimensional and multidimensional models to item
response matrices. The output from a typical NOHARM run includes the results
for the latent trait parameterization, the common factor model
reparameterization as well as, in the unidimensional case, Lord's
parameterization (i.e., a vector of discrimination parameters, a, and
difficulty parameters, b, are provided). In addition, a residual joint-
proportions matrix is included in the output which can be useful to assess the
fit of a given model.

However, the greater degree of computational efficiency associated with
the ULS estimation procedure is achieved at the sacrifice of information
(Mislevy, 1986). That is, only the information in the one-way marginals
(percent-corrects) and two-way marginals (joint percent-corrects) is utilized
by NOHARM in the estimation of parameters, thus explaining why it is often
referred to as a "limited" or "bivariate" factor analytic method. However,
McDonald (in press) and Muthén (1978) have suggested that one should not lose
too much information in the absence of higher-order marginals. Also, Knol and
Berger (1991) compared NOHARM parameter estimates to those obtained based on a
full-information factor analytic model (i.e., using TESTFACT; Wilson, Wood, &
Gibbons, 1987) and generally found only slight differences between the two
procedures with respect to their ability to recover (simulated) factor
analytic parameters. However, these findings were based on a limited number of
replications (10) and should be interpreted cautiously. Nonetheless, from a
practical perspective, it would seem that there might not be much to be gained
in using full-information methods. Balassiano and Ackerman (1995b) have also
shown that the overall performance of NOHARM, with respect to recovering
simulated item parameter values, was satisfactory, even with small sample
sizes (N = 200).

Another limitation of the model, again attributable to the ULS
estimation procedure, is the absence of standard errors for the parameter
estimates and a fit statistic for the given model. However, McDonald (1994)
and Balassiano and Ackerman (1995b) have suggested criteria (e.g., the inverse
of the square root of the sample size) that may be used as approximate
standard errors for the parameters of the model. Also, two approximate
statistics, based on the residuals obtained after fitting a NLFA (NOHARM)
model to an item response matrix, were proposed and investigated by De
Champlain (1992) and Gessaroli and De Champlain (1995). Results obtained with
a variety of simulated data sets showed that the approximate statistics
were quite accurate in correctly determining the number of factors underlying
simulated item responses. This would suggest that these procedures might be
useful as practical guides for the assessment of model fit, even though they
are perhaps not the theoretically preferred statistics due to the ULS
estimation method on which they're based. However, further research needs to
be undertaken in order to evaluate the behavior of these approximate
statistics in a larger number of conditions before making any definite
statements about their usefulness.

Finally, some authors have noted that a problem with McDonald's model is
the absence of an index that would indicate the appropriate number of
polynomials to retain in a series (Hambleton & Rovinelli, 1986). Findings
pertaining to this question, however, seem to indicate that terms beyond the
cubic can generally be dismissed (McDonald, 1982b; Nandakumar, 1991).

A factor analytic model for dichotomous variables

Christoffersson (1975) and Muthén (1978) proposed a factor analytic
model for dichotomous variables in which it is postulated that the response
variables $x_i$ are accounted for by the latent continuous variables $y_i$ and
threshold variables $h_i$ such that,

$x_i$ = 1, if $y_i > h_i$,
$x_i$ = 0, otherwise,

where,

\[ y = \Lambda\theta + \varepsilon, \tag{17} \]

and $y = (y_1, \ldots, y_n)'$. The model outlined in (17) is identical to the common
factor model with the exception that y is unobserved. Assuming that
$\theta \sim MVN(0, I)$, $\varepsilon \sim MVN(0, \Psi^2)$, where $\Psi^2$ is a diagonal matrix of residual
variances, and $cov(\theta, \varepsilon) = 0$, the covariance matrix $\Sigma$ among the y latent
variables can be expressed as,

\[ \Sigma(y) = \Lambda\Phi\Lambda' + \Psi^2. \tag{18} \]



Therefore , 



(19) 



The probability of a correct response based on Christoffersson's model is
given by,

\[ P(x_i = 1) = \int_{h_i}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-t^2/2} \, dt. \tag{20} \]



The probability of correctly answering a pair of items is given by,

\[ P(x_i = 1, x_j = 1) = \int_{h_i}^{\infty} \int_{h_j}^{\infty} \phi_2(t_1, t_2; \sigma_{ij}) \, dt_1 \, dt_2, \tag{21} \]

where $\phi_2$ denotes the standard bivariate normal density with correlation
$\sigma_{ij}$. Christoffersson (1975), using the tetrachoric expansion (Kendall, 1941),
re-expresses (21) as,

\[ P(x_i = 1, x_j = 1) = \sum_{s=0}^{\infty} \sigma_{ij}^s \tau_s(h_i) \tau_s(h_j), \tag{22} \]

where $\tau_s$ is the s-th tetrachoric function. Given the rapid convergence of the
series, Christoffersson (1975) states that we may cut the expansion after L
terms and use,

\[ P(x_i = 1, x_j = 1) \approx \sum_{s=0}^{L-1} \sigma_{ij}^s \tau_s(h_i) \tau_s(h_j). \tag{23} \]
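
The effect of truncating the tetrachoric series can be illustrated with a short sketch. It assumes the standard normalization of the tetrachoric functions, namely that the zero-order function is the upper-tail normal probability and that the s-th function (s >= 1) equals the Hermite polynomial of degree s - 1 times the normal density, divided by the square root of s factorial; this normalization, the function names and the example values are assumptions made here, not details taken from Christoffersson (1975).

```python
from math import asin, erf, exp, factorial, pi, sqrt


def normal_pdf(h):
    return exp(-0.5 * h * h) / sqrt(2.0 * pi)


def upper_tail(h):
    """P(Z > h) for a standard normal Z."""
    return 0.5 * (1.0 - erf(h / sqrt(2.0)))


def tetrachoric_function(s, h):
    """Assumed form: tau_0(h) = P(Z > h); tau_s(h) = He_{s-1}(h) phi(h) / sqrt(s!)."""
    if s == 0:
        return upper_tail(h)
    prev, curr = 0.0, 1.0  # He_{-1} placeholder and He_0
    for k in range(s - 1):
        # Recurrence He_{k+1}(h) = h * He_k(h) - k * He_{k-1}(h)
        prev, curr = curr, h * curr - k * prev
    return curr * normal_pdf(h) / sqrt(factorial(s))


def joint_prob_series(h_i, h_j, rho, n_terms):
    """Series of equation (23), truncated after n_terms terms."""
    return sum(
        rho ** s * tetrachoric_function(s, h_i) * tetrachoric_function(s, h_j)
        for s in range(n_terms)
    )


# With both thresholds at zero the orthant probability has the closed form
# 1/4 + arcsin(rho) / (2 * pi), which gives an exact value to compare against.
rho = 0.5
exact = 0.25 + asin(rho) / (2.0 * pi)
series_5 = joint_prob_series(0.0, 0.0, rho, 5)
series_30 = joint_prob_series(0.0, 0.0, rho, 30)
print(exact, series_5, series_30)
```

In this example a handful of terms already reproduces the exact joint probability to about three decimal places, which is consistent with the rapid convergence invoked to justify cutting the expansion after L terms.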



The parameters of Christoffersson's (1975) model can be estimated using
a generalized least-squares (GLS) estimation procedure that minimizes the fit
function,

\[ F = (p - P)' S_e^{-1} (p - P), \tag{24} \]

where,

$S_e$ = a consistent estimator of $\Sigma_e$, the residual covariance matrix;

$P$ = a vector of expected item proportions correct $P_j$ and joint item
proportions $P_{jk}$;

$p$ = a vector of observed item proportions correct $p_j$ and joint item
proportions $p_{jk}$.

Muthén (1978; 1983; 1984; 1988) has proposed a GLS estimator that is
equivalent to that outlined by Christoffersson (1975) but computationally more
efficient. According to Muthén (1978), the parameters of the factor analytic
model for dichotomous variables can be estimated by minimizing the weighted
least-squares fit function,



(25) 



where,

$\sigma$ = population threshold and tetrachoric correlation values;

s = sample estimates of the threshold and tetrachoric correlation
values;

W = a consistent estimator of the asymptotic covariance matrix of s,
multiplied by the total sample size.

This approach, also referred to as GLS estimation using a full-weight
matrix approach (Muthén, 1988), is asymptotically equivalent to
Christoffersson's solution and slightly less demanding in terms of
computational requirements. It is referred to as a full-weight matrix approach
because, as Muthén (1988) states, the GLS estimator utilizes a weight matrix
of size p* x p*, where p* corresponds to the total number of elements in the s
vector.



Advantages and limitations of Christoffersson's/Muthén's factor analytic
model for dichotomous variables

The GLS estimation procedure, unlike ULS, utilizes not only terms from
the one-way and two-way margins but also from the three-way and four-way
margins, that is, the joint proportions correct for three and four items taken
at the same time. As Mislevy (1986) states, the use of a greater amount of
information in the estimation procedure is especially advantageous when one
attempts to extract more from the data, that is, with solutions that contain
fewer items, examinees or more factors (with other conditions held constant).

Also, statistical tests of model fit are readily available. The F
function minimized in the GLS solution (cf. equations 24 and 25) asymptotically
follows a chi-square distribution, with df = k(k-1)/2 - t, where k is equal to
the number of items and t, the number of parameters estimated in the model. In
addition, standard errors for the parameters estimated in the model can be
obtained quite easily.

Finally, Muthén's solution is incorporated in the computer program
LISCOMP (Muthén, 1988). As was the case with NOHARM (Fraser & McDonald, 1988),
LISCOMP (Muthén, 1988) enables the user to fit both exploratory and
confirmatory unidimensional or multidimensional models. Also, the output from
a typical LISCOMP run contains the common factor model parameter estimates and
standard errors as well as a residual correlation matrix and a chi-square
statistic which allows the user to assess the degree of fit of a given model
or competing models.

However, the GLS estimation is computationally very intensive. Although
Muthén's (1978) solution is more efficient than Christoffersson's (1975), the
procedure, as implemented in LISCOMP (Muthén, 1988), is still impractical
using a personal computer with tests containing more than 25 items (Mislevy,
1986; Muthén, 1988).

Also, though GLS makes use of more information to fit the one- and two-
way margins than does ULS, it still ignores higher level interactions and, in
that sense, does not fully utilize all of the available information. However,
as was the case for ULS estimation, it is quite possible that this loss of
information is inconsequential.

Full-information item factor analysis

Bock and Aitkin (1981) proposed, based on the following m-factor model
(for dichotomous data),

\[ y_{ji} = \lambda_{i1}\theta_{1j} + \lambda_{i2}\theta_{2j} + \cdots + \lambda_{im}\theta_{mj} + \varepsilon_{ji}, \tag{26} \]

that an unobservable response process $y_{ji}$ for person j to item i is a linear
function of m normally distributed latent variables $\theta_j = [\theta_{1j}, \theta_{2j}, \ldots, \theta_{mj}]$
and factor loadings $\lambda_i = [\lambda_{i1}, \lambda_{i2}, \ldots, \lambda_{im}]$. This latent response process, $y_{ji}$,
is related to the binary (observed) item response $x_{ji}$ through a threshold
parameter, $\gamma_i$, for item i, in the following fashion:

if $y_{ji} \geq \gamma_i$, then $x_{ji} = 1$,
if $y_{ji} < \gamma_i$, then $x_{ji} = 0$.

The probability that examinee j with abilities $\theta_j = [\theta_{1j}, \theta_{2j}, \ldots, \theta_{mj}]$
will correctly answer item i is given by the function,

\[ P(x_{ji} = 1 | \theta_j) = \Phi\left[ -\left( \gamma_i - \sum_{k=1}^{m} \lambda_{ik}\theta_{kj} \right) / \sigma_i \right], \tag{27} \]

where $\Phi$ corresponds to the unit normal cumulative distribution and $\sigma_i$ is the
standard deviation of the unobserved random variable $\varepsilon_{ji} \sim N(0, \sigma_i^2)$.
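
The link between the threshold rule and the response function can be checked by simulation. The sketch below is illustrative only: it takes the convention that a response is scored 1 when the latent process is at or above the threshold, so that the conditional probability of a correct response is the normal distribution function evaluated at minus (threshold minus weighted abilities) over the residual standard deviation. The loadings, threshold, abilities, seed and number of draws are hypothetical, and the residual standard deviation is set by assuming uncorrelated factors and a unit-variance response process.

```python
import random
from math import erf, sqrt


def normal_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def prob_correct(theta, loadings, threshold, sigma):
    """Conditional probability of a correct response under the threshold model."""
    z = sum(l * t for l, t in zip(loadings, theta))
    return normal_cdf(-(threshold - z) / sigma)


def simulate_prob_correct(theta, loadings, threshold, sigma, n_draws, rng):
    """Share of simulated latent responses that reach the threshold."""
    z = sum(l * t for l, t in zip(loadings, theta))
    hits = sum(1 for _ in range(n_draws) if z + rng.gauss(0.0, sigma) >= threshold)
    return hits / n_draws


# Hypothetical two-factor item and examinee.
loadings = [0.7, 0.4]
threshold = 0.2
theta = [0.5, -1.0]
# Assumption: uncorrelated factors and a unit-variance latent response process.
sigma = sqrt(1.0 - sum(l * l for l in loadings))

rng = random.Random(12345)
analytic = prob_correct(theta, loadings, threshold, sigma)
simulated = simulate_prob_correct(theta, loadings, threshold, sigma, 200000, rng)
print(analytic, simulated)
```

With a large number of draws the simulated proportion should agree with the analytic value to within sampling error, which is a convenient way to confirm the sign convention adopted for the threshold.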

Bock and Aitkin (1981) proposed a marginal maximum likelihood (MML)
procedure to estimate the parameters in the model based on Dempster, Laird and
Rubin's (1977) EM algorithm. The threshold and factor loadings are estimated
so as to maximize the following function,

\[ L_m = P(X) = \frac{N!}{r_1! \, r_2! \cdots r_s!} \tilde{P}_1^{r_1} \tilde{P}_2^{r_2} \cdots \tilde{P}_s^{r_s}, \tag{28} \]

where N is the total number of examinees, $r_s$ is the frequency of response
pattern s and $\tilde{P}_s$ is the marginal probability of the response pattern based
on the item parameter estimates. The function outlined in (27), with the MML
parameter estimates obtained by means of the EM algorithm, is commonly referred
to as full-information item factor analysis (Bock, Gibbons, & Muraki, 1988) and
has been implemented in the computer program TESTFACT (Wilson, Wood, & Gibbons,
1987).

Advantages and limitations of Bock and Aitkin's (1981)/Bock, Gibbons and
Muraki's (1988) full-information item factor analysis

One of the key advantages of full-information item factor analysis
(FIFA) is that it utilizes all available information in the estimation
procedure. Contrary to the two least-squares models previously outlined, which
are restricted to lower-order marginals, FIFA is based on the estimation of
item response vectors and hence uses all available information in the data.

Also, the procedure is implemented in the computer program TESTFACT
(Wilson, Wood, & Gibbons, 1987). The output from a TESTFACT analysis contains,
among other things, classical item statistics and factor analytic parameter
estimates as well as their associated standard errors. In addition, a
likelihood-ratio chi-square test is provided to help the user determine the
fit of a model, or of competing models.

However, the use of all information contained in the $2^p$ item vectors,
where p is equal to the number of items, by FIFA requires that there should be
no empty cells, which is usually not feasible unless some collapsing is done.
In addition, as Mislevy (1986) and Berger & Knol (1990) have noted, the
goodness-of-fit statistic computed by TESTFACT will be very unreliable with
data sets containing more than 10 items due to the small expected number of
examinees per cell. More precisely, Mislevy (1986) states that the
approximation to the chi-square distribution might be poor in this instance.
Wilson, Wood and Gibbons (1987) also caution against relying on the $G^2$ fit
statistic when a large number of cells have expected frequencies near zero. In
that instance, the authors recommend using the $G^2$ difference test (comparing
two specific models) given that it follows a chi-square distribution in large
samples, even in the presence of a sparse frequency table.
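
The sparseness problem follows directly from the arithmetic of the response pattern table. The short sketch below tabulates the number of possible patterns and the average number of examinees per pattern for a few test lengths; the sample size of 1,000 and the test lengths are hypothetical values chosen for illustration.

```python
def average_cell_count(n_examinees, n_items):
    """Average number of examinees per cell of the 2^p response pattern table."""
    return n_examinees / 2 ** n_items


n_examinees = 1000  # hypothetical sample size
for n_items in (5, 10, 15, 20):
    print(n_items, 2 ** n_items, average_cell_count(n_examinees, n_items))
```

With 10 items there are already 1,024 possible patterns, so a sample of 1,000 examinees leaves fewer than one examinee per cell on average, and the situation deteriorates rapidly as items are added.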

Conclusion 

IRT models have been used extensively in the past few decades not only 
in the development and analysis of educational test items but also in a host 
of other applications such as for the equating of alternate test forms and the 
detection of differentially functioning items. 

Several researchers have suggested, however, that common IRT models are
really specific cases of a more general NLFA model (Goldstein & Wood, 1989;
Knol & Berger, 1991; McDonald, 1967; in press; Takane & De Leeuw, 1987). The
research conducted by the latter authors clearly shows that familiar IRT
models, such as the normal ogive and logistic functions, can easily be
expressed with a factor analytic parameterization. The findings obtained in
these studies would therefore seem to suggest that NLFA might provide a useful
framework with which to address measurement-related issues that had been
primarily investigated using IRT models.

Three factor analytic models were briefly outlined. More precisely,
McDonald's (1967; 1982b) polynomial approximation to a normal ogive model,
Christoffersson's (1975)/Muthén's (1978) factor analysis model for dichotomous
variables and Bock and Aitkin's (1981)/Bock, Gibbons & Muraki's (1988) full-
information factor analytic model, were described. In addition, the major
strengths and weaknesses of each model were delineated. Based on this
information, are there any conditions that might dictate the use of one model
over another?

The main advantage associated with McDonald's polynomial approximation 
to a normal ogive model, that is, the relative economy of the ULS estimator, 
also constitutes its primary shortcoming. In other words, as Mislevy (1986) 
stated, the higher degree of computational efficiency is achieved at the 
sacrifice of information. The model utilizes lower-order marginals in the 
estimation process and consequently ignores higher-order relationships among 
the data. However, there is some empirical evidence to suggest that 
"limited-information" factor analytic parameter estimates do not differ 
substantially from those obtained using the theoretically sounder 
"full-information" method as implemented in the computer program TESTFACT 
(Wilson, Wood, & Gibbons, 1987; Knol & Berger, 1991). 
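
The distinction between lower-order marginals and the full set of response 
patterns can be sketched as follows. The tiny response matrix and the function 
names are hypothetical; the sketch only shows which summaries a 
limited-information method works from (item and item-pair proportions) as 
opposed to the pattern frequencies used by a full-information method. 

```python
from collections import Counter
from itertools import combinations


def lower_order_marginals(responses):
    """Return first-order (item) and second-order (item-pair) proportions."""
    n_people = len(responses)
    n_items = len(responses[0])
    first = [sum(r[i] for r in responses) / n_people for i in range(n_items)]
    second = {
        (i, j): sum(r[i] * r[j] for r in responses) / n_people
        for i, j in combinations(range(n_items), 2)
    }
    return first, second


def pattern_frequencies(responses):
    """Count how often each complete response pattern occurs."""
    return Counter(tuple(r) for r in responses)


# Hypothetical data: six examinees, three binary items.
responses = [(1, 1, 0), (1, 0, 0), (1, 1, 1), (0, 0, 0), (1, 1, 0), (0, 1, 0)]
first, second = lower_order_marginals(responses)
patterns = pattern_frequencies(responses)

print("proportion correct per item:", first)
print("joint proportion correct per item pair:", second)
print(len(patterns), "of", 2 ** 3, "possible patterns observed")
```

With n items there are n first-order and n(n-1)/2 second-order proportions, 
whereas the number of possible patterns is 2^n, which is why pattern tables 
become sparse quickly as tests get longer. 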

Also, the absence of standard errors for the estimated factor analytic 
parameters and of a fit statistic to gauge the overall adequacy of a model 
are major disadvantages of McDonald's model, as implemented in the computer 
program NOHARM (Fraser & McDonald, 1988). Nonetheless, approximate standard 
errors have been proposed as useful guides in assessing parameter estimation 
accuracy (Balassiano & Ackerman, 1995b; McDonald, 1994). Also, two approximate 
chi-square statistics, based on the residual matrix obtained after fitting an 
m-factor model to an item response matrix using NOHARM (Fraser & McDonald, 
1988), proved to be very accurate with respect to correctly identifying the 
number of dimensions underlying simulated data sets in specific conditions. Of 
course, these chi-square statistics are weak in their theoretical foundation 
because they are based on ULS estimation. However, Browne (1977) 
has indicated that in many cases, these chi-square statistics are formally 
equivalent to those derived from GLS estimation and that they differ only 
slightly. Therefore, from a practical perspective, these approximate 
chi-square statistics might be useful tools to those interested in fitting 
McDonald's model to item response matrices. 

The factor analytic model for dichotomous variables proposed by 
Christoffersson (1975) and amended by Muthen (1978) is, from a theoretical 
standpoint, superior to McDonald's approach in that the GLS estimation 
procedure yields a valid chi-square goodness-of-fit statistic as well as 
legitimate standard errors for the estimated parameters. Nonetheless, the 
model is still based on "limited information" in that it ignores higher-level 
interactions in the data. Also, the computational requirements of Muthen's GLS 
solution as implemented in LISCOMP (Muthen, 1988) are quite exacting: they 
increase proportionally to the number of factors and with the fourth power of 
the number of items. This led Mislevy (1986) to suggest that Muthen's solution 
might be adopted with tests that have a relatively small item to factor ratio. 

Finally, the full-information factor analytic model proposed by Bock and 
Aitkin (1981) and Bock, Gibbons, and Muraki (1988) is, on theoretical 
grounds, the strongest of the approaches outlined, given that it does, as the 
name implies, make use of all available information contained in the 2^n item 
response vectors (n being the number of items). However, in most applications, 
use of the full information is usually not feasible unless cells are 
collapsed. Also, the computational requirements associated with the MML 
estimation procedure implemented in TESTFACT increase geometrically with the 
number of factors specified in the model but only linearly with the number of 
items and response vectors. Hence, Mislevy (1986) advises using this procedure 
with longer tests and more parsimonious models. 
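
A rough sense of these growth rates can be had from the sketch below. It 
encodes only the proportionalities stated above (number of factors times the 
fourth power of the number of items for the GLS solution; geometric in the 
number of factors and linear in items and response vectors for MML). All 
constants are dropped, so only ratios are meaningful, and points_per_factor 
(the base of the geometric growth, thought of here as quadrature points per 
dimension) is an assumption rather than a documented TESTFACT setting. 

```python
def gls_relative_cost(n_items, n_factors):
    """Relative GLS cost: proportional to factors * items**4."""
    return n_factors * n_items ** 4


def mml_relative_cost(n_items, n_factors, n_patterns, points_per_factor=10):
    """Relative MML cost: geometric in factors, linear in items and patterns.

    points_per_factor is a hypothetical base for the geometric growth.
    """
    return points_per_factor ** n_factors * n_items * n_patterns


# Doubling test length from 20 to 40 items with two factors.
gls_ratio = gls_relative_cost(40, 2) / gls_relative_cost(20, 2)
mml_ratio = mml_relative_cost(40, 2, 1000) / mml_relative_cost(20, 2, 1000)
print("GLS cost multiplier when items double:", gls_ratio)
print("MML cost multiplier when items double:", mml_ratio)

# Adding a third factor to a 20-item test.
gls_factor_ratio = gls_relative_cost(20, 3) / gls_relative_cost(20, 2)
mml_factor_ratio = mml_relative_cost(20, 3, 1000) / mml_relative_cost(20, 2, 1000)
print("GLS cost multiplier when a factor is added:", gls_factor_ratio)
print("MML cost multiplier when a factor is added:", mml_factor_ratio)
```

Under these assumptions, lengthening the test penalizes the GLS solution far 
more than MML, while adding factors penalizes MML far more than GLS, which is 
consistent with the recommendations attributed to Mislevy (1986). 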

In summary, it would appear as though a greater number of empirical 
studies should be undertaken to compare the various NLFA models with respect 
to how accurately they can recover simulated parameter values, before making 
any definite recommendations as to which procedure should be favored given a 
specific set of conditions, i.e., test length, sample size, factor model, etc. 
It is possible that the theoretically sounder models, e.g., full-information 
factor analysis, might not yield substantially more accurate parameter 
estimates than methods that are based on lower-order marginals, e.g., 
McDonald's (1967; 1982b) and Muthen's (1978; 1988) approaches. There is some 
preliminary evidence to support this claim (Boulet & Gessaroli, 1992; Gibbons, 
1984; Knol & Berger, 1991). Some authors have even suggested that factor 
loadings obtained from a linear factor analysis of phi and tetrachoric 
correlation matrices did not differ noticeably from those derived using 
LISCOMP (Muthen, 1988; Parry & McArdle, 1991) or TESTFACT (Wilson, Wood, & 
Gibbons, 1987; Knol & Berger, 1991). However, more studies are needed in this 
area to clearly identify the conditions in which one method might outperform 
another. 
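
For readers less familiar with the two correlation coefficients mentioned 
above, the sketch below computes both from a single 2 x 2 table of item 
scores. The table counts are hypothetical. The tetrachoric value uses the 
classical cos-pi approximation rather than the maximum likelihood estimate 
that dedicated programs compute, so it should be read as an approximation 
that is exact only in special cases (for instance, when both items are split 
50/50). 

```python
import math


def phi_coefficient(n11, n10, n01, n00):
    """Pearson correlation between two binary items from a 2 x 2 table."""
    numerator = n11 * n00 - n10 * n01
    denominator = math.sqrt(
        (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    )
    return numerator / denominator


def tetrachoric_cos_pi(n11, n10, n01, n00):
    """Cos-pi approximation to the tetrachoric correlation (not the MLE)."""
    if n10 == 0 or n01 == 0:
        return 1.0
    if n11 == 0 or n00 == 0:
        return -1.0
    odds_ratio = (n11 * n00) / (n10 * n01)
    return math.cos(math.pi / (1.0 + math.sqrt(odds_ratio)))


# Hypothetical table: n11 = both correct, n00 = both incorrect.
phi = phi_coefficient(40, 10, 10, 40)
tetrachoric = tetrachoric_cos_pi(40, 10, 10, 40)
print(f"phi = {phi:.3f}, tetrachoric (approx.) = {tetrachoric:.3f}")
```

In this table the tetrachoric value is noticeably larger than phi, which is 
one reason the choice of correlation matrix can matter for a linear factor 
analysis of binary items. 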

In addition, the usefulness of these models in addressing common 
measurement-related problems should be investigated. For example, LISCOMP 
(Muthen, 1988) and TESTFACT (Wilson, Wood, & Gibbons, 1987) provide chi-square 
goodness-of-fit statistics in order to aid the practitioner in determining 
which model best accounts for the item response probabilities. Also, similar 
fit statistics have been proposed to accompany McDonald's (1967; 1982b) NLFA 
model (De Champlain, 1992; Gessaroli & De Champlain, 1995). Given that 
unidimensionality of the latent ability space is one of the main postulates 
underlying most IRT models, it would seem important to evaluate the degree of 
accuracy with which each of these fit statistics is able to correctly identify 
or reject this assumption under a variety of simulated conditions. Similarly, 
it might be interesting to assess the degree of effectiveness of these fit 
statistics in detecting violation of local independence. A frequent problem 
that confronts practitioners is how to best model item response data that 
contain item sets, that is, where several items refer to a common stem, e.g., a 
reading comprehension passage. The factor analytic framework might provide the 
means of effectively dealing with local item dependence through items loading 
on a secondary dimension, for example. The usefulness of the NLFA framework in 
addressing these types of issues will be illustrated in the next three 
presentations of this symposium. 

An overview of the three symposium presentations 

The first paper will be centered on outlining methods available for the 
assessment of dimensionality that are based on NLFA. The use of these 
procedures to test for specific dimensional structures will be 
illustrated using data from a national testing program. 

The next paper will compare the degree of accuracy of parameter 
estimates when based on "limited-information" (NOHARM) and "full-information" 
(TESTFACT) factor analytic models for simulated unidimensional and 
multidimensional data sets. In addition, the use of these methods will be 
depicted with actual achievement test data. 

The final paper will focus on explaining how the factor analytic 
framework might be useful in dealing with local item dependence (LID). 
Specifically, the identification of LID using NLFA will once more be 
illustrated with data from a national testing program. Also, methods of 
obtaining "purified" estimates of reliability and standard errors of 
measurement as well as ability (i.e., without LID contamination) will be 
outlined. 

It is hoped that these presentations will underscore the usefulness of 
NLFA in addressing the above-mentioned problems with actual achievement test 
data and stimulate discussion with respect to these areas, thus hopefully 
fostering future research. 

Appendix A 

Takane and De Leeuw's (1987) IRT-NLFA proof 

Let $\tilde{x} = (\tilde{x}_1, \ldots, \tilde{x}_n)'$ be a random vector of 
response patterns to $n$ binary items on a test (a tilde marks a random 
quantity and the plain symbol a realized value). Each $\tilde{x}_i$ is assigned 
a value of 1, if the examinee correctly answers the item, or 0, if there is an 
incorrect response. Let $\tilde{u}$ be an $m$-component random vector of 
abilities ($m < n$) with its density function denoted by $g(u)$. $\tilde{u}$ is 
unobservable directly, but is assumed to follow a multivariate normal 
distribution with mean 0 and covariance $I$ (identity matrix); that is, 
$\tilde{u} \sim N(0, I)$. The domain of $\tilde{u}$ (denoted by $U$) is the 
multidimensional region defined by the direct product of $(-\infty, \infty)$. 
In IRT, the two-parameter normal ogive model specifies the marginal probability 
that $\tilde{x} = x$ (Bock & Aitkin, 1981; Bock & Lieberman, 1970) as, 

$$ \Pr(\tilde{x} = x) = \int_U \Pr(\tilde{x} = x \mid u)\, g(u)\, du, \tag{29} $$

where $\Pr(\tilde{x} = x \mid u)$ is the conditional probability of observing 
response pattern $x$ given $\tilde{u} = u$. Also, it is assumed that, 

$$ \Pr(\tilde{x} = x \mid u) = \prod_{i=1}^{n} \left( p_i(u) \right)^{x_i} \left( 1 - p_i(u) \right)^{1 - x_i}, \tag{30} $$

(that is, local independence) with, 

$$ p_i(u) = \int_{-\infty}^{a_i'u + b_i} \phi(z)\, dz = \Phi(a_i'u + b_i), \tag{31} $$

where $\phi$ is the density function of the standard normal distribution and 
$\Phi$, the normal ogive function (i.e., the cumulative distribution function 
of the standard normal distribution). 

On the other hand, Takane and De Leeuw (1987) state that in the factor 
analytic model proposed by Christoffersson (1975), the marginal probability of 
response pattern $x$ is specified as, 

$$ \Pr(\tilde{x} = x) = \int_R h(y)\, dy, \tag{32} $$

where $R$ is the multidimensional region of integration and 

$$ \tilde{y} = C\tilde{u} + \tilde{e}. \tag{33} $$

Equation (33) corresponds to the common factor analytic model with $C$ being 
the matrix of factor loadings, $\tilde{u}$, the vector of factor scores 
(abilities in an IRT framework) and $\tilde{e}$, the random vector of 
uniqueness components distributed as $N(0, Q^2)$, where $Q^2$ is further 
assumed to be diagonal (linear local independence), and $\tilde{u}$ and 
$\tilde{e}$ are independent of each other. It follows that, 

$$ \tilde{y} \sim N(0, CC' + Q^2), \tag{34} $$

(marginal distribution of $\tilde{y}$) and 

$$ \tilde{y} \mid u \sim N(Cu, Q^2), \tag{35} $$

(conditional distribution of $\tilde{y}$ given $\tilde{u} = u$). The continuous 
random variables $\tilde{y}_i$ are dichotomized by $\tilde{x}_i = 1$, if 
$\tilde{y}_i \geq r_i$, or $\tilde{x}_i = 0$, if $\tilde{y}_i < r_i$, for 
$i = 1, \ldots, n$, where $r_i$ is the threshold parameter for variable $i$. 
Therefore, $R$, the region of integration above, is the multidimensional 
parallelepiped defined by the direct product of intervals, 
$R_i = (r_i, \infty)$ if $x_i = 1$ and $R_i = (-\infty, r_i)$ if $x_i = 0$. 
Now (29), including (30) and (31), is equivalent to (32) with $\tilde{y}$ 
defined in (33). We first prove that (32) implies (29). The authors show that 
from (32) we have 

$$ \Pr(\tilde{x} = x) = \int_R h(y)\, dy = \int_R \left( \int_U f(y \mid u)\, g(u)\, du \right) dy = \int_U g(u) \left( \int_R f(y \mid u)\, dy \right) du, \tag{36} $$

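where $f(y \mid u)$ is the conditional density of $\tilde{y}$ given 
$\tilde{u} = u$. But because of (35), it can be shown that, 

$$ \int_R f(y \mid u)\, dy = \prod_{i=1}^{n} \int_{R_i} f_i(y_i \mid u)\, dy_i, \tag{37} $$

where, with $c_i'$ denoting the $i$-th row of $C$, 

$$ \int_{R_i} f_i(y_i \mid u)\, dy_i = \Phi\left( \frac{c_i'u - r_i}{q_i} \right)^{x_i} \left( 1 - \Phi\left( \frac{c_i'u - r_i}{q_i} \right) \right)^{1 - x_i}, \tag{38} $$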



for $i = 1, \ldots, n$. In this instance $q_i^2$ is the $i$-th diagonal element 
of $Q^2$. Equation (38) is thus equivalent to (31) by setting 

$$ a_i = \frac{c_i}{q_i}, \tag{39} $$

$$ b_i = -\frac{r_i}{q_i}, \tag{40} $$

for $i = 1, \ldots, n$. 

Takane and De Leeuw (1987) state that it might appear as though factor 
analysis with $c_i$, $r_i$ and $q_i$ ($i = 1, \ldots, n$) has more parameters 
than IRT with only $a_i$ and $b_i$ ($i = 1, \ldots, n$). However, according to 
the authors, when the data are dichotomous, the variance of $\tilde{y}_i$ 
cannot be estimated due to the lack of relevant information in the data, and 
thus, $q_i$ can be set to an arbitrary value. Hence, the effective number of 
parameters is identical in both models. In conclusion, the authors mention that 
the equivalence of marginal probabilities in IRT and FA models holds 
approximately with logistic (IRT) models also, as long as the logistic 
distribution provides a good approximation of the normal distribution (i.e., 
normal ogive). 
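
The reparameterization at the heart of the proof can be checked numerically. 
The sketch below is illustrative only: the loadings, threshold and ability 
values are hypothetical, the function names are not taken from any program 
discussed here, and fixing the variance of the underlying variable at 1 is 
just one convenient choice for the arbitrary scale of q_i. 

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function (normal ogive)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def dot(v, w):
    return sum(vi * wi for vi, wi in zip(v, w))


def fa_to_irt(c, r, q):
    """Map loadings c_i, threshold r_i and unique s.d. q_i to a_i and b_i."""
    return [cj / q for cj in c], -r / q


def prob_correct_fa(c, r, q, u):
    """Pr(y_i >= r_i | u) when y_i given u is N(c_i'u, q_i**2)."""
    return normal_cdf((dot(c, u) - r) / q)


def prob_correct_irt(a, b, u):
    """Two-parameter normal ogive: Phi(a_i'u + b_i)."""
    return normal_cdf(dot(a, u) + b)


# Hypothetical item with two factors; q chosen so that var(y_i) = 1.
c, r = [0.6, 0.3], 0.25
q = math.sqrt(1.0 - dot(c, c))
a, b = fa_to_irt(c, r, q)

for u in ([-1.5, 0.0], [0.0, 0.0], [0.7, -0.4], [2.0, 1.0]):
    assert math.isclose(prob_correct_fa(c, r, q, u), prob_correct_irt(a, b, u))

# Rescaling c_i, r_i and q_i by a common constant leaves a_i and b_i unchanged,
# which is why q_i can be set to an arbitrary value.
k = 3.0
a_scaled, b_scaled = fa_to_irt([k * cj for cj in c], k * r, k * q)
print(a, b)
print(a_scaled, b_scaled)
```

The two printed parameter sets agree up to floating-point rounding, so the 
rescaled factor analytic parameters imply the same item response function. 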







